Population size bias in Diffusion Monte Carlo 
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The size of the population of random walkers required to obtain converged estimates in DMC 
increases dramatically with system size. We illustrate this by comparing ground state energies of 
small clusters of parahydrogen (up to 48 molecules) computed by Diffusion Monte Carlo (DMC) 
and Path Integral Ground State (PIGS) techniques. We contend that the bias associated to a finite 
population of walkers is the most likely cause of quantitative numerical discrepancies between PIGS 
and DMC energy estimates reported in the literature, for this few-body Bose system. We discuss 
the viability of DMC as a general-purpose ground state technique, and argue that PIGS, and even 
finite temperature methods, enjoy more favorable scaling, and are therefore a superior option for 
systems of large size. 



I. INTRODUCTION 

Quantum Monte Carlo (QMC) methods are widely 
utilized to compute accurate thermodynamics of quan- 
tum few-body systems. The best known, and arguably 
most popular such method, is the Diffusion Monte Carlo 
(DMC), which has been extensively adopted over the 
past three decades, especially in the context of electronic 
structure calculations for atoms and molecules [T], but 
also in studies of light nuclei [5] , as well as of small Bose 
clusters such as ( 4 He)_/v or (H 2 )iv @]- 

On the other hand, the Path Integral Ground State 
(PIGS) [5H7] and related methods [8], have only relatively 
recently emerged as an interesting alternative to DMC. 
The most obvious advantage of PIGS over DMC is the 
straightforward, unbiased computation of ground state 
expectation values of quantities other than the energy, 
including off-diagonal correlations such as the one-body 
density matrix [51411]. not accessible within DMC. 

There is, however, another significant difference be- 
tween the two methods, one that may have so far been 
overlooked and/or understated in the literature, namely 
that the results obtained by DMC are intrinsically bi- 
ased by a necessarily finite population of random walkers. 
PIGS, on the other hand, is affected by no such limita- 
tion; we argue in this paper that this yields PIGS an edge 
over DMC, as systems of increasing number of particles 
are investigated. Specifically, we show quantitatively, us- 
ing a simple test system, that the bias arising from a 
fixed finite population is a rapidly increasing function of 
the number N of particles in the system (possibly lead- 
ing to an exponential scaling of the computational cost); 
furthermore, for a given N and for the typical numbers 
of random walkers commonly utilized, the bias can be 
both suprisingly large in magnitude, as well as difficult 
to control or remove, as the extrapolation of results ob- 
tained for different population sizes is not only very time- 
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consuming, it can be quite problematic as well. 

We illustrate the above conclusions by carrying out a 
systematic comparison of ground state energy estimates 
yielded by DMC and PIGS, for a small cluster of parahy- 
drogen (H2) molecules, including between N=13 and 
7V=48 molecules. We deem this a cogent test case, as 
a finite Bose cluster could be regarded as the paradigm 
physical system for which DMC ought to be applicable 
straightforwardly, almost as a "black box" . 

The remainder of this manuscript is organized as fol- 
lows: in the next section, we briefly review the basic 
differences between DMC and PIGS. Because both tech- 
niques are extensively discussed in the literature, we re- 
fer the reader to the appropriate references for a more 
in-depth illustration (see, for instance, Refs. 151 [T2]) . We 
then outline the model utilized in this work as a test case 
to perform calculations, and devote the bulk of this paper 
to a thorough presentation of the numerical results. We 
then discuss whether the bias due to a finite walker popu- 
lation may be the (main) cause of outstanding discrepan- 
cies between energy estimates for parahydrogen clusters 
reported in the literature, and offer our view on the im- 
portance of the population size bias on the scalability of 
DMC. On this point, we note that the hypothesis of an 
overall exponential scaling with N of the computational 
resources needed for DMC, has already been put forward 
by others [13] ■ 



II. METHODS 

PIGS and DMC have the same theoretical basis; in 
both, the exact ground state of a quantum system is 
projected out of an initial trial state, by simulating on 
a computer its evolution in imaginary time. Consider 
for definiteness a system of N identical particles of mass 
m; we assume for simplicity that the system obeys Bose 
statistics [13] . 

The quantum-mechanical Hamiltonian % of the system 
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representation, simply as statistical averages, i.e. 



A' 



n = n + v = -a v » + V ( R ) 



(i) 



where A = ti 2 /2m, R = r!r 2 ...rAr, are the positions of the 
N particles, and V(R) is the total potential energy of the 
system associated with the many-particle configuration R 
(this is typically the sum of pairwise interactions, but can 
be more general). The exact ground state wave function 
<&o(R) can be formally obtained from an initial trial wave 
function 'J't(-R) as 
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where 



G{R,R',/3) = (R\exp[-pn]\R') 



(2) 
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is commonly referred to as the imaginary-time propaga- 
tor. While Eq. ([2| is formally exact, for a nontrivial 
many-body problem one does not normally have access 
to G(R, R' , (3). However, using one of several available 
schemes, it is possible to obtain approximations for G, 
whose accuracy increases as (3 — !• 0; if G {R,R' , ff) is 
one such approximation, one can take advantage of the 
identity exp[-(3H] = (exp[-rH]) M , with (3 = Mr, and 
obtain G(R,R',(3) as 
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where R = R , Rm = R '■ Eq. Q is exact in the limit 
M — > oo (i.e., t — y 0) , which can be achieved in practice 
by extrapolating numerical results obtained with differ- 
ent values of M. 

The difference between PIGS and DMC lies in how 
the above procedure is implemented numerically. In 
PIGS, one generates sequentially, on a computer, a 
large set {X p }, p = I,2,...,P, of many-particle paths 
X = RqRi ... i?2M through configuration space. Each 
Rj = TjiTj2 ... rjjv is a point in 3./V-dimensional space, 
representing positions of the N particles in the system. 
These paths are statistically sampled, using the Metropo- 
lis algorithm, from a probability density 

,2M-1 x 

V{X) cc V t (Ro)*t(R2m) \ II Go(R i+ uRuT) \ (5) 

It is a simple matter to show E] that in the limits 
r — > 0, Mr — > oo, Rm is sampled from a probability den- 
sity proportional to the square of the exact ground state 
wave function $ (-R), irrespective of the choice of 
|15j . One can therefore use the set {R M } °f "midpoint" 
configurations Rm of the statistically sampled paths, to 
compute ground state expectation values of thermody- 
namic quantities F(R) that are diagonal in the position 
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an approximate equality, asymptotically exact in the 
P — > oo limit. The ground state expectation value of 
the energy can be obtained in several ways; it is particu- 
larly convenient to use the "mixed estimate" 
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which provides an unbiased result for the Hamiltonian 
operator H. 

Obviously, the total projection time (3 = Mr remains 
finite. It is straightforward to prove that the energy esti- 
mate E(f3), corresponding to a finite value of f3 is a strict 
upper bound on the exact ground state energy E a , which 
is approached monotonically in the f3 — » oo limit as 



E{(3) -E a ~ cexp(-/3 AE), 



(8) 



where AE is the energy gap between the ground state 
and the first excited state. 

By contrast, DMC implements the imaginary time evo- 
lution of the initial, trial state by introducing an 
importance-sampling transformation of Eq. ([2]), 

lim/^oo / dR' G(R, R', (3) V T (R')V G (R'), (9) 

where ^ G is a positive-definite guidance function and 
G(R,R',j3) = y G (R)G{R,R',f3)/y G (R'). Hereafter, 
as almost invariably done for Bose systems, we take 
= ^g- Eq. ([9| is simulated by a guided, diffusive 
random walk through configuration space of a popula- 
tion of N\y (ideally uncorrelated) walkers. Each walker 
performs successive transitions from its present config- 
uration R p to a new one R ni sampled from a diffusive 
probabilistic kernel contained in G (R n , R p ,t), with the 
addition of a drifting term which depends on ^>t- The 
aim of such a drifting term is allowing for importance 
sampling of the configurations, normally expected to re- 
duce considerably the variance of the estimates. There 
is no importance sampling in PIGS, on the other hand 

mm- 

A crucial feature of DMC is the fact that walkers, 
along the random walk, accumulate weights propor- 
tional to exp[- f dTE L (V T (R))T)], where E L (V T (R)) = 
T-L^t{R) /^t{R) is the local energy given by the trial 
wave function at the configuration R, visited at imagi- 
nary time r by a given walker. Typically, weights fluc- 
tuate considerably, both along the random walks, as well 
as within the population at any given time. Therefore, 
it proves convenient to reconfigure the population, ev- 
ery now and then during the calculation, so that walk- 
ers whose weights have become negligibly small are dis- 
carded, and copies are made of walkers whose weights are 
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larger. This reconfiguration, known as branching, is done 
in such a way that fluctuations in the weights of individ- 
ual walkers remain limited. In addition, control must be 
exerted in order to limit fluctuations in the population 
size (or total weight). Here too, it is possible to show 
that in the limit of long projecion time the population of 
walkers will sample a distribution of configurations pro- 
portional to $ (-R)^'t(-R), which can then be used to 
evaluate the exact ground state energy. 

The main advantage of this computational strategy, 
at least in principle, is that the projection time can be 
made very large with little computational effort. On the 
other hand, a bias is introduced in the procedure, as one 
must necessarily work with a finite population of walk- 
ers. In order for the algorithm to be exact, extrapolation 
to infinite population size must be carried out (see, for 
instance, Ref. H2lfor details). 

There has been surprisingly little work aimed at estab- 
lishing the magnitude of the finite population size bias on 
the computed expectation values, but some calculations 
have shown that it can be significant, particularly when 
trying to estimate expectation values of operators that do 
not commute with the Hamiltonian |17H19j . On general 
grounds, one can expect the bias to depend on the accu- 
racy of the guiding wave function V&t; if, hypothetically, 
the exact ground state wave function were known, then a 
single walker would suffice, as branching would disappear 
and the DMC calculation would reduce to a variational 
one, as expected. On the other hand, the less accurate 
the more significant the fluctuations of the local en- 
ergy associated to individual walkers, and with those the 
more important the effect of branching, from which the 
need for a larger population size ensues. Within PIGS 
there is no such bias, as there is no population and no 
branching. 



III. MODEL AND CALCULATIONS 



enjoy enhanced stability over others. This has been in- 
vestigated in a number of works by computation of the 
total ground state energy E(N), as a function of cluster 
size N, and by looking for isolated peaks of the chemical 
potential n(N), defined as 

H(N) = E(N - 1) - E(N) (10) 

Clearly, the precise identification of magic clusters re- 
quires a sufficiently accurate determination of E(N), 
which is an extensive quantity. At present, there exist 
outstanding discrepancies between different ground state 
results obtained by DMC [HHS], PIGS [26], as well as 
by extrapolating to T = results at finite temperature 
[20] . This point is discussed in detail below, where we 
argue that the population size bias in DMC is likely at 
the root of such a discrepancy between different calcula- 
tions, at least for the largest size clusters. 

The quantum-mechanical many-body Hamiltonian is 
given by Eq. ([!]), with A = 12.031 KA 2 and the following 
choice for the potential energy V(R): 

V(R) = Y J v(n ] ) (ll) 

Here, v is the potential describing the interaction between 
two hydrogen molecules, only depending on their relative 
distance. It should be made clear at the outset that such 
a simple model potential is not the most accurate choice 
that one could make; three-body terms are known to be 
quantitatively important. However, since the aim of this 
paper is mostly methodological, we limit ourselves to the 
use of a pair potential, and select that by Silvera and 
Goldman [27 , for consistency with existing calculations 
against which we are interested in comparing our results. 
We have computed ground state energies of clusters of 
size ranging between iV=13 and 7V=48, using both PIGS 
as well as DMC. 



Our physical system of interest, for which we present 
all the numerical results discussed here, is a self-bound 
cluster of N parahydrogen molecules, regarded as point 
particles, moving in three dimensions. This system has 
been the subject of much theoretical investigation over 
the past few years, as it is believed to display an in- 
teresting interplay of classical and quantum-mechanical 
physical effects [2U]. Clusters of parahydrogen of less 
than 20 molecules are liquidlike and superfluid at low T; 
if the number of molecules is between 20 and 40, clusters 
can "quantum melt" at low temperature i.e., go from 
a solidlike arrangements, with molecules sitting at pre- 
ferred sites, to a superfluid one, in which they are essen- 
tially delocalized throughout the cluster [2TJ [22] . More- 
over, some specific "supersolid" clusters, superfluid and 
solid behaviours appear to coexist in the T — » limit 
[23]. 

An interesting issue is whether there exist clusters of 
specific sizes (also referred to as "magic numbers") that 



A. PIGS 

Our PIGS calculations are based on a trial wave func- 
tion of the Jastrow type 

9 P (R) = J] exp[-u(r«)] (12) 

i<j 

where = |rj — r\,|, and with u{r) = a/r 5 , a being a 
variational parameter whose value was set to 375 A 5 for 
all clusters studied here. This wave function is the same 
employed in previous studies based on PIGS [55], albeit 
with a different value of the parameter a. It is not meant 
to describe a finite self-bound system, in that it only in- 
cludes short-range correlations arising from the repulsive 
core of the intermolecular potential. A variational cal- 
culation based on such a trial wave function yields an 
unbound cluster, i.e., E(N) — 0. One of the most impor- 
tant aspects of PIGS is precisely its ability to extract the 



correct physics even if the initial trial wave function is 
chosen less than optimally; for example, in Ref. |16j it is 
shown that an accurate ground state energy estimate for 
solid helium can be obtained by PIGS even on setting the 
trial wave function equal to a constant. This is in stark 
contrast to DMC, for which an appropriate choice for ^>t 
often proves crucial to the accuracy and reliability of the 
calculation. 

In this work, the same approximation for G(R, B! , r) 
utilized in Refs. 7, 26J was chosen, namely: 



G {R,R',t) = G f (R,R',t) exp 



2tV(R) 



(13) 



where Gf{R, R' , t) is the analytically known propagator 
for a system of non-interacting particles and 



r 2 h 2 



N 



V(Rj) = 2V(Rj) + — ^iViRj)) 



(14) 



if j is odd, whereas V(Rj) = V(Rj) is j is even. It is 
G(R, R', t) = G (R, R', t) + C(r 5 ). The path sampling 
techniques are the same described in Ref. [7J. 



B. DMC 

For the DMC calculations we adopt the same trial func- 
tion used in Ref. [25], ^di(R) = Yl%<j ex P[ — w ( r ij)]i 
where the pair pseudopotential w(r) = /3/r 5 + br/N 
differs from u(r) of Eq. ( 12 ) for the need to include 



a linear term which prevents molecules from evaporat- 
ing; the variational parameters are [25] j3 — 294 A 5 and 
b = 2.79 A -1 . We also consider a much better trial func- 
tion ^D2(R) with a more flexible pair pseudopotential 
and a three-body correlation of the standard form [28] . 
both optimized [29] for each cluster size. For TV = 48, 
the variance of the local energy of <P ui is smaller than 
that of i&Di by over an order of magnitude. Significantly 
better trial functions can only be obtained by including 
four- and five-body terms |30j . but this route seems to 
be viable only for very small systems. The details of 
the DMC simulations are essentially those described in 
Ref. [12] , notably we utilize the standard approximation 
for the propagator, supplemented with the well-known 
"rejection" scheme, which has been shown to afford con- 
vergence of the numerical estimates with a significantly 
greater time step than would be otherwise required. We 
only use a slightly different translation of weights into 
multiplicity during the branching reconfiguration. 

Before we discuss the results, a point must be made 
clear, namely that our purpose here is to carry out an 
unambiguous, unbiased comparison of energy estimates 
obtained by DMC and PIGS. Because we are considering 
a Bose system, for which the ground state wave function 
is positive-definite, the numerical results given by the two 
algorithms are expected to coincide, within statistical un- 
certainties, once extrapolations to infinite projection time 
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FIG. 1: (Color online). Estimates of the ground state en- 
ergy per particle for a cluster of 48 parahydrogen molecules, 
computed by PIGS as explained in the text, for varying val- 
ues of the time step r (in K -1 ). The total projection time is 
f3 =1 K -1 . The dashed line shows a quartic fit to the data, 
extrapolating to a r = limit of -38.14(1) K. 



for PIGS, infinite size of the population sample for DMC, 
and zero time step for both are carried out. Implementa- 
tion details of either method, such as the approximation 
adopted for the short-time propagator or the choice of 
the moves in the random walk, only affect the efficiency 
of the calculations, and are of no particular concern here. 
On the other hand the population bias of DMC, which is 
the focus of our study, depends on the quality of the trial 
function (which cannot be arbitrarily improved in gen- 
eral) to such an extent that the extrapolation to infinite 
number of walkers can be problematic or even unfeasible 
in practice. 



IV. RESULTS 

In order to establish our main finding, we begin by il- 
lustrating results of calculations of ground state energet- 
ics for the largest cluster studied here, comprising 7V=48 
parahydrogen molecules. Specifically, we compare PIGS 
and DMC results. 

Figure [T] shows estimates of the ground state energy 
per parahydrogen molecule e = E/N obtained by PIGS 
with a total projection time /3=1 K _1 , and with different 
values of the time step r. A fit to the data based on the 
expression e(/3,r = 0) = e(/3,r) + ct 4 , justified by the 
use of the propagator ( 13 ), yields a value extrapolated to 
equal to e((i 

oo, r = 0) 



r = U equal to e(p = 1 K , t = 0) = -38.14(1) 

As mentioned above, an estimate for e(/3 
can be obtained by extrapolating results obtained with 
different projection times |31j . The result is shown in Fig- 
ure [2] The asymptotic value is indistinguishable, within 
statistical errors, from that at f3 = 1 K _1 . Our energy 
estimate is slightly higher than that offered in Ref. [26], 
namely —38.22(3) K, for a projection time j3 = 0.8 K _1 
and with a time step r = 1.5625 x 10~ 3 K _1 . For the 
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FIG. 2: (Color online). Extrapolation to the j3 — > oo limit of 
estimates of the ground state energy per particle for a clus- 
ter of 48 parahydrogen molecules, computed by PIGS as ex- 
plained in the text, for varying values of the projection time 
p. Estimates shown are extrapolated to the r — > limit. 
Dashed line is the fitting curve described in the text. 



same time step, our estimate is —38.17(2) K (see Figure 
[TJ), compatible with that of Ref. [26] if statistical uncer- 
tainties are taken into account [35]. On the other hand, 
it is surprisingly almost 1 K below the most recent DMC 
estimate for this cluster, namely —37.28(3) K, by Sola 
and Boronat [25] . 

Such a discrepancy can hardly be regarded as "negligi- 
ble" , considering that the value of the chemical potential 
/x(iV) (Eq. [Top , used to assess cluster stability, is com- 
puted by subtracting two extensive energy values, i.e., 
associated to whole clusters. For instance, a systematic 
error of the order of 0.9 K per molecule results into one 
on the total energy of the N — 48 cluster of approxi- 
mately 45 K, which is very close to the value of /x quoted 
in Ref. [55] for this cluster. 

In order to shed light on this worrisome disagreement 
between numerical data advertised as "exact" , we have 
performed DMC calculations for the same cluster, as ex- 
plained above. All the results presented so far are calcu- 
lated with a time step of 2.0 x 10~ 4 K _1 . We find that 
the time step error on the energy per particles is similar 
for all clusters, in the range of sizes considered here. It 
does depend on the trial function, however. Our esti- 
mates are -0.07 K for \& u\ and less than 0.01 K for ^ D2- 
Figure [3] shows the ground state energy per particle for 
a cluster of 48 parahydrogen molecules as a function of 
the number of walkers, calculated with the trial functions 
<F di and \& D2 of Sec. |IIIB[ If the walkers were uncor- 
related, the population bias would vanish as 1/Nw |12j . 
This is clearly not the case: for both trial functions, we 
can fit data obtained with Nw between 200 and 200,000 
(not all of this range is shown in Figure |3| with the ex- 
pression e(Nw) = e(oo) -I- cN^, and the optimal value of 
the exponent is 0.342 for the "good" trial function ^D2, 
and as low as 0.202 for the "poor" trial function *Bdi, 
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FIG. 3: (Color online). Ground state energy per particle for 
a cluster of 48 parahydrogen molecules as a function of the 
number of walkers Nw, computed by DMC using two different 
trial functions: (diamonds, blue online) and (circles, 
red online). The lines are power law fits to the data, and the 
open symbols show the extrapolated values at 1/Nw = 0. 
The time step utilized here is 0.0002 (in K" 1 ). 



the reduced % 2 being smaller than 1 in both cases. 

There are several things to note here. First and fore- 
most, the result e = -37.278 ± 0.028 reported in Ref. [35] 
for N = 48, allegedly based on data "analyzed to reduce 
any sistematic bias to the level of statistical noise", is 
outside the scale of the figure. The DMC energies of Ref. 
[25|are systematically higher than those reported in Ref. 
1261 the difference increasing (non-monotonically) with N; 
we find it to be greatest (~ 0.9 K) at iV=48, while it is 
of the order of 0.2 K per molecule for iV=30, and 0.4 K 
per molecule at iV=40. 

In Ref. EU which reports DMC energy estimates (essen- 
tially identical with those of Ref. |2"5[) for clusters of size 
up to N=A0, authors observe a "marked" effect of pop- 
ulation size, on performing calculations for Nw ranging 
from 500 to 2,000, suggesting nevertheless that its overall 
effect on the chemical potential might be negligible, pre- 
sumably due to some expected (fortunate) compensation 
of error. 

Indeed, our DMC values are similar to those of Refs. 
[53J[55] when we take Nw ~ 1500; the problem is that the 
small slope of the e(Nw) curve around such a value of 
Nw is highly deceiving, as the slope actually appears to 
diverge, as 1/Nw — ► 0, as clearly shown by data in Figure 
[3] Therefore, not only is extrapolation of results which 
depend so dramatically on the number of walkers clearly 
problematic - one can be easily led to believe incorrectly 
that convergence with respect to Nw has been reached, 
by focusing on relatively narrow a range of Nw, as in 
Ref. [2H (it does not help if discrepancies with published 
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results by others are simply ignored). The extrapolated 
value agrees, as it should, with the PIGS result (within 
two standard deviations, for ^02)1 but the amount of 
computer time needed to reach a given statistical accu- 
racy is much larger for DMC than for PIGS. 

For N = 23 the population bias is still definitely not 
linear in 1/Nw, but its magnitude is much smaller than 
for N — 48; deviations from the linear behavior become 
hard to detect for TV = 13. We can define the number of 
walkers N\y needed to observe convergence of the energy 
to a precision e via the relation e{Nyy) — e(oo) = e. For 
e = 0.01 K we find N w = 5000 for N = 13 and as much 
as N w = 100 millions for N = 48 if we use V D2 - A 
sensible estimate for N = 48 using is not even pos- 
sible from our simulations, which in this case, even using 
up to 200,000 walkers, still leave a large uncertainty in 
the best-fit exponent of e(Nw). In terms of the compar- 
ison between the DMC El ESI and the PIGS HO results 



(see Table IV), which initially motivated this work, the 
dependence of the popolation bias on the system size par- 
allels and presumably explains the similar dependence in 
the observed discrepancies. 



N DMC EH 



DMC 



PIGS 



13 -20.952(16) 
23 -28.111(12) 
36 -33.804(19) 
48 -37.278(28) 



-20.98(1) -21.02(1) 

-28.15(1) -28.16(1) 

-34.09(2) -34.13(1) 

-38.15(2) -38.14(1) 



TABLE I: Ground state energy per molecule (in K) for dif- 
ferent parahydrogen clusters, computed by DMC (Ref. |25] 
and this work) and PIGS. PIGS estimates are extrapolaed 
to the r — > limit, for a total projection time /3 — 1 K _1 . 
DMC estimates obtained in this work are extrapolated to the 
1/Nw —> limit as explained in the text. Statistical errors, 
in parentheses, are on the last digit(s). 



V. DISCUSSION 



pounds with a relatively low quality of the trial function; 
the strength of the interparticle potential makes it dif- 
ficult to devise and use much more accurate trial wave 
functions than ■ As a result, N = 48 - a very modest 
size for a boson system |33j- turns out to be already a 
demanding calculation. 

On this point, it is interesting to note that, in a previous 
study [7] , a comparison of ground state energy estimates 
for bulk liquid 4 He obtained with PIGS and DMC, found 
that PIGS yielded consistently lower results, and that the 
difference between PIGS and DMC results increases with 
density. The suggestion was already made back then that 
the use of a finite population in DMC, comprising only 
a few hundred walkers in those DMC calculations, seems 
very likely to be the cause of the discrepancy. 

It should also be mentioned that there exists an alter- 
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FIG. 4: Ground state energy per particle for a cluster of 48 
parahydrogen molecules as a function of Tp, computed by 
DMC using the * 02 trial function with Nw = 12800. Error 
bars are only shown for a few points. The horizontal line is 
the extrapolated value at 1/Nw = from Figure [3] The time 
step utilized here is 0.0002 (in KT 1 ). 



Although the results shown above illustrate rather 
clearly that the bias arising from the control of the pop- 
ulation is significant, it could be argued that the use 
of a more accurate trial wave function (e.g., ^D2 in- 
stead of ^di m the case shown in Figure [T]), consider- 
ably improves the convergence, and therefore it is unclear 
whether the problem should be ascribed to a finite popu- 
lation, or rather to a poor choice of ^x- As it turns out, 
although a superior trial wave function can indeed alle- 
viate the problem of finite population bias, this should 
not induce much optimism on the scalability of DMC in 
general. For, the behavior illustrated in Figure [3] is ulti- 
mately due to statistical correlation between walkers, in 
turn induced by large fluctuations of the branching term 
exp(— tEl(R)). Since El is an extensive quantity, one 
can expect -and does indeed observe [13] - an extremely 
poor asymptotic scaling of the efficiency of DMC with the 
system size. For molecular hydrogen, this problem corn- 



native procedure, one that in principle could remove the 
bias due to a finite population of walkers in DMC without 
requiring an extrapolation with Nyy- One can carry out 
the DMC simulation with a single target value of Nw 
and store the renormalization factors /j of the popula- 
tion size along the simulation |12j . with i a time index. 
The bias would be eliminated by accumulating weighted 
averages, the weight being defined for each configuration 
as the inverse of the product of all factors fa from the 
beginning of the simulation up to the current time. 
While in principle this procedure completely undoes the 
effect of the population control, in practice it leads to 
unacceptably large variance. Thus, one keeps in the 
weighted averages only the product of the last Kp factors 
fi, and seeks convergence of the results by increasing the 
"correction time" Tp = KpT. However, one is bound to 
face severe efficiency problems whenever the population 
size bias is strong. Figure [4] illustrates an attempt at cor- 
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recting the population size bias for the ground state en- 
ergy of a cluster of 48 parahydrogen molecules calculated 
with 12800 walkers. From the biased value at Tp = the 
energy is expected to converge for large times to the exact 
value (the extrapolation of Figure [3j shown in Figure [4] 
by the horizontal line). However no convincing evidence 
of convergence can be detected before the statistical error 
grows as large as the bias itself, despite this simulation 
being 8 times longer than that performed for the single 
point at N w = 12800 of Figure ^ 

In conclusion, we have presented numerical evidence 
to the effect that the bias arising from a finite popula- 
tion size in DMC calculations is the most likely cause of 
discrepancies reported in the literature between ground 
state energy estimates for Bose systems obtained with 
DMC and Metropolis-based methods such as PIGS. Al- 
though a complete removal of the bias (whose magni- 
tude appears to have been generally underestimated, or 
in any case not fully appreciated) is possible in principle, 
the computational resources required grow significantly 
with system size. In fact, although the system sizes for 
which we are presenting data in this work are too small 
to make that conclusion, they are strongly suggestive of 
exponential scaling. Obviously, although we have illus- 
trated quantitatively this conclusion on a Bose system, it 
applies equally to fermions, there being nothing in the ar- 
gument expounded here that depends on quantum statis- 
tics. If anything, there are reasons to expect that the use 
of the popular fixed-node approximation to circumvent 



the sign problem may conceivably worsen the problem 
of fluctuating local energy, which is at the root of the 
population bias. Thus, while the choice between the two 
methods has been so far largely regarded as one of "per- 
sonal taste" , path integral methods, requiring no walker 
population, may prove a better choice for systems of large 
size, how large depending on the quality of the trial func- 
tion. 

Finite temperature methods such as Path Integral 
Monte Carlo, which do not require a population of walk- 
ers, also do not suffer from the kind of bias discussed 
in this work, that affects instead any population-based 
procedure such as GFMC (including for lattice Hamilto- 
nians) and DMC. Thus, although one may naively think 
that ground state methods would necessarily be better 
suited for T=0 calculations, PIMC may in fact also prove 
a better option than DMC in some cases, given the signif- 
icance of the population size bias. It is worth mentioning 
that for the specific physical system discussed her, PIMC 
yields estimates in the T — > limit consistent with those 
furnished by PIGS [20H22]- 
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